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Abstract 

We develop a way of simulating disease spread in networks faster at the cost of some accuracy. 
Instead of a discrete event simulation (DES) we use a discrete time simulation. This aggregates 
I events into time periods. We prove a bound on the accuracy attained. We also discuss the 
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choice of step size and do an analytical comparison of the computational costs. Our error bound 
concept comes from the theory of numerical methods for SDEs and the basic proof structure 
comes from the theory of numerical methods for ODEs. 
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' 1 Introduction 

in 

^Sl ' Traditional models for analyzing the spread of an infectious disease in a population rely on faulty 

o ■ 

' assumptions of human behavior, such as random mixing, homogeneity and the lack of persistence 

of partnerships [l, 2]. Network simulation models, which remedy these drawbacks, are at the 
^ . forefront of health policy research in analyzing the control and prevention of infectious diseases 



[3, 4]. Traditional compartmental models aggregate individuals in a population into different bins 
and only count the total number of individuals in certain states; whereas, network models track 
each individual's state separately. The advantage of tracking individuals separately instead of 
as groups allows the modeler to pinpoint the origin of an outbreak and to target individuals with 
highly specific characteristics. Network models also have the unique ability to model network-based 
interventions such as contact tracing, partner delivered therapy or concurrency reduction [5, 6, 7]. 

Network models can range from simplistic and static to complex and dynamic [7, S]. Underlying 
every network model is the notion of nodes and edges. Nodes usually represent individuals in the 



'Both authors contributed equally. 



population, but can also represent cities or places [9, 10, 11]. Edges connect the nodes and are 
defined for a duration of time that sufficient transmission of the infectious agent is possible if one 
of the nodes is infected. In the simplest case there would only be one type of edge, but one can 
imagine using different types of edges to denote varying kinds of contact, such as sexual, friendship, 
etc. A simple network model of an infectious diseases would track over time the state of each node. 
The state of nodes would include the infection status of the node and might include other dynamic 
or stable endogenous or exogenous variables. 

Many network models are extensions of either an SI or SIS process. These contact processes 
are stylized models of disease spread where each node in the network is either in the susceptible 
or infected state with transmission occurring along edges. Along with the random walk and the 
voter model, the contact process is one of the prototypical stochastic processes on networks. The 
SI model is for an incurable disease where once a node is infected it will remain so indefinitely while 
in the SIS model an infected node becomes susceptible after the infection has passed or is cured. 
In this paper, we focus on SI and SIS processes on a network because of their foundational nature. 

Network models often need to be both large (in the number of nodes) and run for long periods 
of time to realistically model the long-term effects of an intervention in a large community. In 
addition such simulations require much time (in replications or in clock time), in order to get a 
sufficiently precise estimate of the spread of an infection or the effects of different interventions. 
Indeed, we often need many replications of such a simulation to estimate the variability of the 
results, a quantity which is also important when designing large clinical trials or epidemiological 
studies to ensure that the results will be statistically significant. 

Given the computational effort involved in network modeling, it is natural to find ways to speed 
up the model efficiently with minimal side effects. A practical starting point is the taxonomy of 
models in [12], which might point to a simpler type of model (e.g., a compartmental model) that 
may sometimes be appropriate. Another, more mathematical approach to speeding up stochastic 
models is using a diffusion approximation. These take a stochastic compartmental model and take 
the limit as the population goes to infinity. Using a functional central limit theorem, this limit 
becomes a stochastic differential equation [13]. However, this approach does not work for network 
models since each node in our population is unique. Moreover, it is unclear how one would define 
the limit as the population grows while keeping the network structure constant. 
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We develop a different way of approximating an infection process in a population, one which 
is specific for network modeling. Instead of using a natural discrete event simulation (DES) we 
use a discrete time simulation (DTS). This aggregates events into time periods; for example with 
monthly time steps we would only update the infection status of nodes once a month instead of 
after each infection or cure. Naturally, there is a loss in accuracy involved, especially for larger 
time steps. With large time steps, a node infected in one time step might infect a second in the 
same time step, an event that would be captured in the DES but not in the DTS. We prove a 
strong convergence result and furthermore, a bound on the accuracy attained as a function of the 
time step. This error bound allows us to do an analytical comparison of the computational costs 
for different step sizes and to estimate the maximum step size for the error that we can tolerate 
in our simulation. Superficially, our technique is similar to a method called tau-leaping, in that 
we aggregate events across time steps in a discrete event simulation to transform the model into a 
discrete time simulation [14]. However, the similarities end there. In tau-leaping, only the number 
of individuals in each infection state is kept track of, similar to a compartmental model. In a 
network model, we care about each individual's position in the network. 

Bounding the error of a discrete time simulation of a continuous time process and proving a 
strong convergence result is a common issue in the theory of numerical methods for SDEs as the 
numerical methods are usually discrete time simulations [15]. However our stochastic process is not 
derived from Brownian motion as is the case for most SDEs. Thus, while the concepts are similar, 
our proof approach is somewhat different. Our proof structure follows the convergence proofs for 
numerical methods for deterministic ODEs [16]. 

We assume our SI and SIS simulations on a network are Markov; that is, for the next state 
transition all we need to know is the current state of the network, or equivalently that the hazard 
rates are constant. In this way, we can formulate our DES as a continuous time Markov chain 
(CTMC) and our DTS as a discrete time Markov chain (DTMC). Thus, aside from the connections 
to numerical methods for SDEs and ODEs, we can also frame our problem as a way of constructing 
a DTMC from CTMC. However, our DTMC (representing our DTS) is only an approximation 
of the true DTMC achieved by sampling the CTMC at discrete time points. The true DTMC 
achieved by sampling is difficult to deal with since its transition matrix is the matrix exponential of 
the CTMC's rate matrix, whose size is already exponential in the number of nodes in the network. 
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Two other ways of constructing a DTMC from a CTMC are the embedded DTMC of a CTMC or the 
embedded DTMC of a uniformized CTMC [17]. However, these are not practical approximations. 
The embedded DTMC focuses on the sequence of states visited and loses all sense of "when" the 
transitions happen. Also, the embedded DTMC has the same number of transitions as the CTMC. 
The uniformized CTMC allows self-transitions in order to make all the sojourn times have the same 
expected duration. This leads it to have even more transitions in the same span of time as the 
original CTMC. This defeats our goal of speeding up the simulation by reducing the number of 
transitions for a given time span. To our knowledge, this is the first rigorous analysis of the error 
of a DTS approximation of a DES [18]. 

In the next section we define our DES and DTS and then describe them from a Markov chain 
perspective. Then in section 3, we state the main results and give a sketch of the proofs. The proof 
details are relegated to the appendix. Section 4 illustrates these results numerically and section 5 
analyzes the computational cost. We conclude in section 6 with a discussion of future work. 

2 Model 

In this section we describe the various parts of the model. We start by introducing a DES and our 
network notation. Then we describe how our DTS "batches" events in our DES. We then define 
notation for the CTMC and DTMC stochastic processes, before precisely defining the SI and SIS 
contact processes. We end with a theorem stating an inequality between the DTMC and the CTMC 
for the SI process. Table 1 summarizes our notation. 

Definition of a Discrete Event Simulation (DES) Discrete event simulations model an 
evolving system in continuous time as a sequence of events. Each event corresponds to a change 
in the system state. Events happen instantaneously and occur at separate time points. A set of 
timers are associated with each possible subsequent event. The time to next event is given by the 
smallest of the timers. When an event occurs, we say that the timer "fires". Immediately following 
an event, timers are updated and new possible events may be added. Conversely, in a discrete time 
simulation, time is divided into distinct intervals of length h. The status of the system changes 
instantaneously at the end of each interval. These updates approximate the cumulative changes 
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n, number of nodes 

E, set of edges of the graph 

k, max degree in the graph 

X, generic state of the graph, a binary vector of length n 
x' > X, the vector inequahty is component-wise 
xo, initial state of the graph 
I • 1 , 1-norm of a vector 

\x\, number of infected nodes when graph in state x 
S{x), set of susceptible nodes when graph in state x 
I{x), set of infected nodes when graph in state x 

n{j,x), number of (or set of edges from j to) infected nodes neighboring j when the graph 

is in state x and j is susceptible, and (or 0) otherwise 

!(•), indicator function 

w £ i}, scenario or sample point 

DES, discrete event simulation, the CTMC 

DTS, discrete time simulation, the DTMC 

NB(r,p), a negative binomial random variable giving the number of successes before r 
failures where the success probability is p. 

Geometric(p), an geometric random variable on {0, 1,2, ... } giving the number of failures 

before the first success where the success probability is p 

Exp(A), an exponential random variable with rate A. 

Yi < Y2 in distribution means Pr[Yi < y] > Pr[y2 ^ y\- 

X(t) = X{t, w), true state (i.e., state of the CTMC/DES) at time t 

h, length of each time step of the DTMC/DTS 

Xi(w) = X{ih,w), the CTMC sampled at time steps of length h 

(Ai), iid random vectors where Ai contains the random numbers to simulate from time 

(i — l)h to time ih 

(Fi), the natural filtration of (Ai) 

g{x,a), the transition function for (Xi), Xi = g{Xi-i, Ai) 

Xi{w), state of the DTMC/DTS approximation after i steps, at time ih 

g{x,a), the transition function for (Xi), Xi = g{Xi-i, Ai) 

ti = ei{w) = Xi{w) — X{ih,w), global error 

f{x,a) = {g{x,a) — x)/h, analogous to right hand side of an ODE 
D{x',x,a) = {x' — x)/h — f{x,a) = {x' — g{x,a))/h, difference operator 
di = di{w) = D{X{ih,w),X{{i — l)h,w),a), local error 
N{t) = N{t,w) = \X{t)\, number of infected nodes in X{t) 
Ni = Ni{w) = \Xi\, number of infected nodes in Xi 

Table 1: Notation and Acronyms 
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that would have occurred during that interval in continuous time. 

Graph Setup We consider a network (or graph) of n nodes with set of edges E and maximum 
degree k. The state of the graph, x G {0, 1}", is an n-dimensional binary vector describing which 
nodes are infected; = 1 if node i is infected. For a graph in state x and node j, we let x + j 
denote the state where node j is also infected. We let \x\ be the 1-norm of the vector or equivalently 
the number of infected nodes. We let S{x) denote the set of susceptible nodes and I{x) the set of 
infected nodes. For graph states x and x', we let x < x' denote the component-wise inequality, that 
the infected nodes in x' includes all those in x. Depending on the context we let n{j, x) be the set 
of edges between a susceptible node j and its infected neighbors or the number of such neighbors 
(we define it to be or if j is not susceptible). 

Batching Our approach to construct the DTS is to batch the events of the DES in a time step 
of length h. Suppose at time ih the network is in state x. Now applying one step of the DTS is 
equivalent to selecting all the timers in the DES that are set to fire in time h (i.e., before time 
(i+l)/i) and executing them at once, essentially batching all the events that happen in that interval. 
The updated state and set of timers is then ready for the next time step. This is in contrast to the 
DES where we select only the next timer to fire and process the timers sequentially. 

Definition of X{t) and Xi Our DES and DTS of the infection processes on our graph can be 
formulated as a CTMC, X{t,w), and a DTMC, Xi(w), respectively. This is due to the Markov 
property that we assume for our SI and SIS processes. Here w £ 0, represents a scenario (a.k.a. 
sample point or state of the world). Both processes start at the same initial state, X{0) = Xq = xq. 
Our goal is to compare the CTMC X{t) with the constructed DTMC {Xi), so that we can compare 
the accuracy of our DTS with our DES. The DTMC after i steps, Xi, should approximate X{ih). 
We will also define the DTMC Xi{w) = X{ih,w) that denotes the CTMC sampled at the same 
time points. Thus we will consider X{t) and (Xj) to be the true process and {Xi) the DTMC 
approximation . 

Definition of SI Process In the DES of the SI process, there are only transitions to states with 
one additional infected node, that is from a state x we can only transition to states of the form 
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X + j, where j G S{x). The transition rate from x to x + j is n{j,x). Nodes may not become 
uninfected: if node j is infected in Xi (or Xi), then j is infected in Xi^i (or (or Xi^i). In the 
DTS, multiple nodes may become infected during a transition. The probability of a node becoming 
infected in the DTS is as follows: for any susceptible node j in S{Xi), the probability of it being 
infected in is 1 — exp(— /in(j, Xj)). These probabilities are independent of each other. 

Definition of SIS Process The SIS process differs from the SI process only in that it allows 
for recovery of infected nodes (and subsequent reinfection). In addition to the transitions of the 
SI process, in the DES the graph may also transition from state x + j to state x when j G S{x). 
The associated transition rate is /i. We construct the DTS is an analogous fashion as for the 
SI DTS. For any susceptible node j in S{Xi), the probability of it being infected in ^(j+i) is 
again 1 — exp{—hn{j,Xi)). For any infected node the probability of it being recovered in -'^(j+i) 
is 1 — exp(— /i/i). We also assumes, like in the SI DTS, that the transitions across nodes are 
independent of each other. That is, only primary infections may occur and any recovered node 
cannot become reinfected in the same time step. 

Filtration and DTMC It will be convenient to explicitly construct a filtration. We let {Ai) be 
a sequence of iid random vectors and let (Fi) be the natural filtration of (Ai). We will assume that 
Xi is adapted to Fi, that is Ai, . . . ,Ai contain all the information (i.e., all the random numbers) 
to simulate the DES until time ih. For example we could let Ai be a vector of iid exponential 
random variables, with the dimension of the vector large enough so that we have enough random 
numbers for any step of the simulation. Specifically, Ai would contain the random numbers for all 
the event timers in the DES from time {i — l)h to time ih. Thus we can denote the actions of the 
DES over a time interval of length h by the function x' = g(x,a) moving the graph state from x 
to x' with the random numbers in a, allowing us to define the (Xi) recursively, Xi = g(Xj_i, Aj) 
for all i > 1. Our goal is then to construct an easy to simulate DTMC (Xi) adapted to the same 
source of random numbers, (Fi). Specifically, we seek to construct a simple function x' = g{x,a), 
defining the approximate DTMC recursively, Xi = g{Xi^i, Ai) for all i > 1. 

As an illustration, we now explicitly construct the filtration for the SI process. We let each 
Ai be a vector of \E\ iid Exp(l) random variables, one for each edge in the network, with Ai{e) 
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denoting the component corresponding to edge e. If the graph is in state x between time {i — l)h 
and ih, then we have an active timer with time Ai{e) for each edge e between a susceptible and 
an infected node. This gives the above rate of infection, n{j,x), for any susceptible node j. In 
the DES, when a node j becomes infected in the time interval [{i — l)h,ih), we update the set of 
active timers using the components of Ai'. we deactivate those active timers for edges between j 
and another infected node and activate those for edges between j and a susceptible node. In the 
DTS, we do the same except that the timers are only updated at the end of each time step. 

Theorem lA. For the SI process, X{ih) > Xi a.s. 

Proof. See Appendix. □ 

3 Results and Proof Sketch 

Our goal is to prove bounds on the global error, that is a strong convergence result. The proofs 
are given later in this section. 

Theorem 2A. For the SI process, E[|ej|] < CsiKsih when h <l where Csi = nk'^e^^~'^\ Ksi = 
{l/k){e^'^ -I) andT = ih. 

Theorem 2B. For the SIS process, IE[|ei|] < Csis^sish when h <1 where Csis = nk(kexp(k — 
2) + /i), Ksis = (l/ik + /i))(e('=+'^)^ - 1) and T = ih. 

ODE Analogy We prove a rate of strong convergence using an analogy to ODEs: we treat 
the CTMC as an ODE of the form x = f{x) and treat the DTMC as Euler's method for this 
ODE. Thus we define the right hand side function representing the incremental rate of change 
/(x, a) = {g{x, a) — x)/h, and we define the difference operator D{x' , x, a) = {x' — x)/h — f{x, a) = 
{x' — g{x,a))/h. The difference operator is designed so that for our approximation (i.e., Euler's 
method for ODEs and the DTMC in our case), D{g{x,a),x,a) = and thus the DTMC satisfies 
D{Xi,Xi-i,Ai) = a.s. for all i. While our argument holds with variable steps as in [Hi], we 
assume for simplicity that all time steps are of size h. We define the local (i.e., 1-step) error as 
di = D{X{ih),X{{i — l)h), Ai) and the global (i.e., cumulative) error as = Xj — X{ih). We first 
prove bounds on the local error. 
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Lemma 3A. For the SI process, IE[|(ij|] < Csih where Csi = nk'^ exjp{k — 2) for h < 1. 

Proof. See Appendix. □ 

Lemma 3B. For the SIS process, E[\di\] < Csish when h < 1 where Csis = nk{k exp{k — 2) 

Proof. See Appendix. □ 

Note the dependence of the constants Csi and Csis on n. While this in undesirable it cannot 
be avoided since we accumulate error for each infected node. Using the Lipschitz property of / we 
prove 0-stability of Euler's method which then gives us the desired strong convergence. 

Lemma 4A. For the SI process, E[|/(x,^i) — < Lsi\x — z\ for all x and z, where 

Lsi = k. 

Proof. See Appendix. □ 

Lemma 4B. For the SIS process, E[|/(x,yli) — /(z,^i)|] < Lsis\x — z\ for all x and z, where 
Lsis = k + li. 

Proof. See Appendix. □ 

Lemma 5. Euler's method is 0-stable, that is for any two sequences of random variables (Yi) and 
{Zi) adapted to (Fj) with Yq = Zq, 

E[\Y, - ZiW < K max E[|Z)(y„ y,_i, ^l,-) - D{Zj,Z,_i,Aj)\], 

where K = (l/L)(exp(LT) — 1), T = ih, and L is the Lipschitz constant from Lemma 4 (i-c. 
Lemma 4 A or Lemma 4B depending on the process). Depending on the process we may write K as 
Ksi or Ksis- 

Proof. The proof is given in the Appendix but essentially follows [IG, p41] using Lemma 4 for the 
Lipschitz property. □ 

We next prove Theorem 2 (we do not distinguish between Theorem 2A and Theorem 2B because 
the proofs are analogous). 
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Proof of Theorem 2. Since = Xq = xq, we can substitute Yi = Xj and Zj = Xi into Lemma 5. 
Note that D{Xj, Xj_i, Aj) = 0, and D{Xj, Xj_i, Aj) = dj. Thus, E[|ei|] < K maxi<j<iE[\dj\]. 
Applying Lemma 3 proves the claim. □ 

Note that our proofs (in particular those of Lemma 5 and Theorem 2) also work with variable 
step sizes, {hi), where we let h = maxi<j<j hi. 

4 Numerical Example 

While this is a theoretical paper we included a numerical example to see how the DTS compares 
the DES in practice. Our two test graphs were a 30x30 toroidal lattice (i.e., one that wraps around 
on all four sides) and a small world random graph. The small world graph was created by starting 
with the toroidal 30x30 lattice and then randomly distributing an extra 450 edges in such a way 
that every node has five edges. At the start of each simulation we randomly infected 10% of the 
nodes. We considered both the SI and SIS processes with an infection rate of 1 and a recovery rate 
(for the SIS process) of 0.2. Figure 1 shows the distribution of the prevalence at time 1 for the 
DES and the DTS with step sizes of 0.01 and 0.0215, while Figure 2 shows the difference between 
the average prevalence at time 1 between the DES and DTS of different step sizes. In each case 
we used 1500 replications. In Figure 2, we show a line of unit slope on the log-log plot since the 
theory we developed above suggests that the error is linear in the size of the time step. 

While the two figures describe the accuracy of the DTS, the following table compares the 
computational costs. We again consider the same cases as in Figure 1 but in addition to the 
average difference in prevalence also look at the number of events (i.e., changes in the state of a 
node); the number of time steps; and the CPU time. Of course for a DES, the number of time steps 
will equal the number of events, and for a DTS, it will equal the reciprocal of the step size (since 
we ran the simulation until time 1). The ratio of events to time steps tells us how many events are 
batched each time step on average: about five per step for the smaller step and ten per step for the 
larger steps. The difference in number of events between between the DTS and DES tells us the 
number of secondary events that are lost in the DTS because they occur in the same time step as 
the event that caused them (2-4% depending on the step size). While the simulation code is not 
optimized in any way, we nevertheless see speeds up 10x~20x faster for the DTS compared to the 
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DES. Again, we used 1500 replications for each case. The last column tells us the difference in the 
mean prevalence at time 1. We don't see the expected factor of two difference in the prevalence 
error between the two step sizes because the slope of hne 1 in Figure 2 is not a perfect fit for the 
smaller step sizes. Nevertheless, even with the larger time steps the average difference in prevalence 
is less than 1.5 percentage points. 



Lattice, SIS 



Lattice, SI 




0.4 0.6 
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Small world, SIS 
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Prevalence 

Small world, SI 
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DTS: h=0.01 
DTS: h=0.215 








0.2 



0.4 0.6 
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Figure 1: Probability density of the prevalence at time 1. We compare how well a DTS with two 
different step sizes approximates the DES. 
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Figure 2: Absolute difference of the average prevalence at time 1 of the DES and the DTS. 

5 Discussion of Computational Cost 

To complement the numerical example of the previous section, we now provide a theoretical discus- 
sion of the computational cost of the DES and DTS. A DTS until time 1 with step size h will have 
1/h steps and differ from the DES in (i.e., have a global error of) 0{n)h nodes. Dividing by n we 
find that the difference in prevalence to the DES is = 0{h). Thus we can describe the number of 
time steps 0(1/5) in terms of the prevalence error. Note how the number of time steps at a given 
level of accuracy does not depend on n, the size of the graph. Nevertheless, the work in each time 
step will still depend on n. 

In order to take this analysis further, to analyze the computational cost of the DES and the 
work per time step of the DTS, we turn to a concrete example. Specifically, we will count the 



12 



Graph 


Process 


Algorithm 


Events 


Time Steps 


CPU Time (s) 


Prev. Diff. 


Lattice 


SIS 


DES 






525.4 


525.4 


5.44 








DTS: 


h= 


=0.01 


517.4 


100.0 


0.48 


0.004 






DTS: 


h= 


=0.0215 


505.1 


46.0 


0.27 


0.012 


Lattice 


SI 


DES 






464.6 


464.6 


3.16 








DTS: 


h= 


=0.01 


461.9 


100.0 


0.48 


0.002 






DTS: 


h= 


=0.0215 


449.2 


46.0 


0.26 


0.012 


Small world 


SIS 


DES 






757.8 


757.8 


7.76 








DTS: 


h= 


=0.01 


744.6 


100.0 


0.52 


0.013 






DTS: 


h= 


=0.0215 


731.6 


46.0 


0.29 


0.014 


Small world 


SI 


DES 






660.9 


660.9 


4.54 








DTS: 


h= 


=0.01 


655.2 


100.0 


0.51 


0.005 






DTS: 


h= 


=0.0215 


644.0 


46.0 


0.28 


0.014 



Table 2: Comparison of algorithms. 



number of timers that need to be created/considered in the DTS versus the DES. For example in 
an SI process on a tree, each node can be infected exactly one way, each edge being used once as 
a timer, and thus the computational cost is the same in the DES as in the DTS, for all step sizes. 
This however is a special properties of trees, for general graphs, the DTS will use fewer timers than 
the DES. One explanation is that by batching events in the DTS, some timers are not needed. 
Specifically if nodes A and B share an edge and are infected in the same time step, then the timer 
for A infecting B is not required, while it would be required in the DES if A was infected first. A 
network interpretation of this explanation is as follows. Consider an SI process on a network where 
currently the nodes I are infected and the nodes S are susceptible. For the next time step of the 
DTS, we create timers for all edges between / and S. If nodes I' C S get infected in that step, 
then we never need to consider timers for infections (i.e., edges) between two nodes in which we 
may need to for a DES, since in a DES, /' always contains only a single node. 

To make this more concrete, we turn to a random graph where every node has k edges. (Such 
a graph can be created by placing edges uniformly at random among pairs of nodes, which don't 
already have degree k or an edge between them.) Considering the SIS process on this graph, we will 
estimate the computational costs. Suppose our simulation starts near the steady-state prevalence, 
p = \xQ\/n. Then for the next time step of the DTS we need approximately |a;o|A;(l — p) infection 
timers, one for each edge between S and /. We ignore recovery events in this discussion because they 
differ less between the DTS and the DES. For the DES, there will be approximately /i|xo|A;(l — p) 
infection events in the next step of size h, each of which we can expect to create k{l —p) additional 
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timers. Essentially creating h\xQ\k'^{l — p)^ additional timers during the time step. Thus the DES 
has a factor kh{l — p) = 0{5) more timers than the DTS in the same time period. Note that the 
difference doesn't scale with n. This analysis does not find any economies or diseconomies of scale. 

One place where there may be economies of scale for DTS and where our existing analysis is 
too conservative is when the network is composed of disjoint copies of a smaller network. In that 
case, each disjoint subnetwork acts as an independent replication of the simulation. In that case 
the central limit theorem applies and the difference in the number of infected nodes should scale 
with ^/n instead of n, and thus the difference in prevalence should decrease as l/\/n instead of 
remaining constant. 

6 Conclusion and Future Work 

In this paper we focused on network infection processes and proved error bounds for the accuracy 
of DTS as compared to DES. We specifically focused on SI and SIS processes on networks with 
bounded degree and proved a strong convergence result where the expected difference in the number 
of infected nodes is proportional to the number of nodes and the step size. This is the first such 
result. It is also a result that brings together several diverse strands of research: DES, Markov 
chains, numerical methods for ODEs, dynamic processes on networks, and epidemiology. In section 
5 we then demonstrate using a numerical example that this bound is linear with the step size and 
that DTS provide computational savings. 

There are two directions for future work. The practical direction would be to confirm the 
benefits of DTS using larger, less-stylized, simulations of disease spread, such as for example a 
simulation of HIV spread in an urban population of men who have sex with men. The theoretical 
direction would be to extend these results to the more general stochastic processes found in the 
aforementioned practical simulations. These are processes where nodes may have different sus- 
ceptibilities to infection; progress to different infectious states (e.g., an incubation period or an 
acute stage of infection); and be put on treatment, essentially allowing nodes to be in more than 
just two states (susceptible and infected). A useful framework for these more general stochastic 
processes are stochastic actor models [19] developed by social network scientists. Methods have 
been developed to parameterize such models from data but ways to speed up their simulation have 
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not. Our contributions and such future improvements allow epidemiologists to quickly simulate 
disease spread on large population over decades to evaluate the efficacy of different intervention 
alternatives. 
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7 Appendix 

Theorem lA. For the SI process, X{ih) > Xi a.s. 

Proof. Since X{0) = Xq, it suffices to show that g{x',a) > g{x,a) for all a whenever x' > x. We 
prove this by showing for all a, that g{x,a) > g{x,a) and that g{x',a) > g{x,a) for x' > x. Now, 
it is easy to see that g{x, a) > g{x, a) given x since all the timers which fire in the DTS in time h 
also fire in the DES. Finally, given x' > x, the nodes which are infected in x are also infected in x' . 
Thus, any node which is infected in the next time step in x also gets infected in x', since we use 
the same set of random numbers a. □ 

Lemma SI. E[NB(r,p)] = rp/{l —p). 

Proof. Wikipedia or your favorite probability textbook. □ 
Lemma S2. For < h < 1 and c > 0, exp(c/i) — 1 < /icexp(c). 

Proof. Using Taylor's theorem with remainder on exp(t) — 1 with t > 0, we have exp(t) — 1 = t exp(t') 
for some G [0,t]. Thusexp(t) — 1 < texp(t). Substituting t = c/i we have exp(c/i) — 1 < hcexp{ch). 
Now, h < 1 and c > imply exp(c/i) < exp(c), proving the claim. □ 

Lemma S3. For < a <b, {b - a)t - (exp(-at) - exp(-6t)) > 0. 

Proof. Note that t + exp(— t) is increasing for t > because its first derivative is 1 — exp(— t) and 
exp(— t) < 1 for t > 0. Thus (6t + exp(— — (at + exp(— at)) > 0, proving the claim for t > 0. Note 
that for t < 0, t + exp(— t) is decreasing because its first derivative is 1 — exp(— t) and exp(— t) > 1 
for t < 0. Since bt < at for t < it follows that {bt + exp{—bt)) — (at + exp{—at)) > 0, proving the 
claim for t < 0. □ 

For notational convenience we define N{t) = |-'^(t)| and Ni = \Xi\ to be the number of infected 
nodes in the DES and DTS respectively. 

Lemma S4. Consider the SI process on a network that is an infinite tree with the root node having 
m children and all other nodes having k — 1 children. Suppose that initially, only the root node is 
infected, N{0) = 1. T/ien Af(t) - iV(0) ~ NB(r,p) where r = m / {k - 2) and p = l-exp{-{k-2)t). 
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Proof. Each time a node is infected the number of edges between infected and susceptible nodes 
(i.e., the number of timers in the DES) increases by /c — 2 (the timer causing the infection is removed 
and k — 1 timers are added due to the children of the newly infected node). Then assuming that 
m = k — 2, the cumulative number of infections, N{t) — N{0), is a a pure birth process where 
the hazard rate of an infection after i infections is Xi = {k — 2)i, the so-called Yule process. It is 
well known that the distribution of this process at time t is Geometric(l — p) = NB(l,p). When 
m ^ k — 2, we let Zi, . . . , Zm denote the number of infections at time t in each subtree of the root 
node. Note that A^(t) — -/V(O) = Zi + • • • + Z.^ and that the Zi are iid. Since the negative binomial 
distribution is divisible, NB(ri,p) + NB(r2,p) ~ NB(ri + r2,p), it follows from the m = k — 2 case 
that Zi ~ NB(1/(A; - 2),p). Thus for general m, N{t) - N{0) = Zi + • • • + Z„ ~ NB(r,p). □ 

Lemma S5. For the SI process, N(t) — N{0) < NB(r,p) in distribution, where r = N{0)k/{k — 2) 
and p = 1 — exp{—{k — 2)t). 

Proof. To obtain an upper bound we maximize the hazard rate at every point in time. Since the 
maximum degree in the network is k, the hazard rate at time is at most N{0)k. After a new 
infection, the hazard rate must decrease at least by one (from the removal of the timer of the newly 
infected node) and can increase at most k — 1 (since the newly infected node has at most k edges 
and one of those is to an already infected node) . Hence the hazard rate must increase no more than 
k — 2. The hazard rates from this upper bound are those of an SI process on an infinite tree with 
each node having k — 1 children and the root node having N{0)k children. Invoking Lemma S4 
proves the claim. □ 

Lemma S6. For the SI process, N{h) — Ni < NB((A'"i — N{0))r,p) in distribution, where r = 
k/{k -2), p = l- exp{-h{k - 2)). 

Proof. The infected nodes X{h) at time h are either those originally infected, xq; those directly 
infected, Xi — xq; or those subsequently infected, X{h) — Xi. All the infected nodes in X{h) — Xi 
stem (directly or indirectly) from infection by nodes in Xi — xq and were not directly infected by 
infected nodes in xq (otherwise the timers for such nodes would have fired by time h implying that 
the nodes would be in Xi). Thus as an upper bound on the number of infected nodes in X(h) — Xi 
we can look at all the infections caused by nodes in Xi —xq over a time period of length h. Invoking 
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Lemma S5 with a graph mitiahy m state Xq 



Xi — Xq proves the claim. 



□ 



Lemma S7. IfE[\f{x,Ai)-f{z,Ai)\] < L\x-z\ forallx > z thenK[\f{x, Ai)- /(z, Ai)\] < L\x-z\ 
for all x and z. 

Proof. Note that by the triangle inequahty, \ f{x,a) — f{z,a)\ < \ f{x,a) — f{mm{x,z),a)\ + \f{z,a) — 
f {mm{x , z) , a)\ where min(x, z) is the component wise minimum of x and z. Then by our hypothesis, 
E[|/(x, yli) — < L\x—mm{x, z)\+L\z—na.m{x, z)\. Since we use the 1-norm, Ix— min(2;, 2;)| + 
\z — min(x, z)\ = \x — z\, proving the claim. □ 

Lemma S8. If x > z, then |n(j, x) — z)| < k\x — z\. 

Proof. If we do not require that n{j, x) = when j is infected, then proving the claim is simple. In 
that case, n(j, x) — n{j, z) > is the number of neighbors of j that are infected in x but not in z. 
Hence, n{j, x) — n{j, z) is the sum of the degrees of the nodes infected in x but not in z. There 
are \x — z\ such nodes, each of degree at most k, proving the claim. 

However, we require a different proof since we define n{j,x) = when Xj = 1. Suppose Xj = 0. 
Then since to x > z, Zj — 0, and n(^j,x) ^ 7t,(j, z). Hence, x) — n(^j,z)\ — • ^j' — 

1, Zji = 0}|. Suppose Xj = 1, and thus, n{j, x) = 0. Then \n{j, x) — n{j, z)\ = n{j, z). This equals 
if Zj = 1 and ■ Zji = 1}| if Zj = 0. Hence, 



^|n(j,x) -n{j,z)\ = |{(i,/) : Xj = 0, x^/ = l,Zji = 0}| + : Xj = l,Zj = 0,Zj' = 1}| 



j 



= \{U,f) ■■ Xj = 0,Xj> = l,Zj> = 0}| 



+ l{(i> j') : = 1> 3;^/ = 1, Zj> = 0}| switching j and j 



■I 



: Xj = Zj = 0,Xj/ = \,Zji = 0}| 



+ ■ Xj = Zj = l,Xj> = l,Zj> = 0}| 



since Xj > Zj 



< \{U,f) ■ Xj' = l,Zj> = 0}| <k\x-z 



Here is an alternative proof. Note, ^ - x) — n{j,z)\ 



+ \n{j,x) - n{j,z)\. 

jel{x)\l{z) j€S{x) 
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Since, x > z, when j S S{x), n{j, x) > n{j, z) and thus, 

= E Ki^S{x),jeI{x)\I{z)) 

j&I{x)\I{z) e,jGE 
jel{x)\l{z) j€l{x)\I{z) 

where s{j,x) is the number of susceptibles connected to j. Hence, 

j£l{x)\I{z) 

< k < \x — z\k. 

jei[x)\i(z) 



□ 



Lemma 3A. For the SI process, < Csih where Cgi = nk"^ exp{k — 2) for h < 1. 

Proof. Note that 

di = {Xi - X,_i)/h - [g{Xi_i,Ai) - Xi-i]/h = {g{X,_uAi)-g{Xi_^,Ai))/h. 

It suffices to bound ]E[|(ii|] for all xq to find a bound for E[|(ii|]. Now, 

E[\di\]=E[\Xi-Xi\]/h 

= E[N{h) - Ni]/h by Theorem lA 

< E[NB((iVi - N{0))r,p)]/h by Lemma S6, 

where r = k/[k — 2) and p = 1 — exp{—h{k — 2)). 

= {E[Ni] - N{0))k/{k - 2)(exp((fc - 2)h) - l))/h by Lemma SI 

< {E[Ni] - N{0))k exp{k - 2) by Lemma S2. 
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Note that 



E[iVi] - iV(0) = Yl l-exp(-/in(j, xo)) 



< ^ hn{j, xo) 



< nkh. 



(1) 



Thus, K[\di\] < nk^ 



exp(fc — 2)h. 



□ 



Since g{x,a) is in {0,1}'^, it fohows that f{x,a) is in {— 0, and hence L = 2n/h 

satisfies the inequahty E[|/(x,^i) — < L\x — z\. In Lemma 4A and 4B, our goal is to 

prove this inequality with smaller values of L. It will be convenient to denote the jth component 
of f{x,a) and g{x,a) by fj{x,a) and gj{x,a) respectively. 

Lemma 3B. For the SIS process, E[\di\] < Csish when h < 1 where Csis = nk{k exp{k — 2)+fi). 

Proof. As in the proof of Lemma 3A, it suffices to bound E[|Xi — for all xq to find the 

bound for E[|dj|]. Note that the same subset of the original infected nodes, I{xq), have a recovery 
event in the DTS as in the DES because we use the same random numbers, Ai. (We say recovery 
event, because we are not excluding the possibility that in the DES a node infected at time 0, 
recovers, and becomes reinfected by time h.) Further, the original infected nodes, I{xq), directly 
infect the same set of nodes in the DES as in the DTS, the ones whose infection timers fire before 
time h. (This is a similar argument as in Theorem lA.) We define these directly infected nodes as 
Y = gsi{x(), Ai) — Xq, where we write gsi to distinguish the transition function for the SI process. 

The difference Xi — Xi then consists of the net additional infections caused by the directly 
infected nodes Y, and the recoveries among those directly infected. (Note that reinfections of 
recovered nodes (including any in I{Y) and I{xq)) are counted among the additional infections.) 
We say net additional infections because some of these additional infections might recover in the 
same time step. Since the SI process does not allow for recoveries, the set of additional infections 
caused by those in Y is at least as large for the SI process as for the SIS process. (A precise way 
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of saying this is that if j G I{y) infects node i via some sequence of intermediate nodes in the 
SIS process, then the same occurs in the SI process, since we use the same random numbers, Ai.) 
Hence, we can use Lemma S6 to justify the same upper bound for the additional infections as in the 
proof of Lemma 3A for the SI process: NB(yr,p), where r = k/(k — 2) and p = 1 — exp{—h{k — 2)). 
For any infected node in Y, Pr[Exp(;u) < h] provides an upper bound on the probabihty of a 
recovery. Thus, 

E[\Xi -Xi\]< E[NB(yr,p)] + E[Y] Pr[Exp(^) < h] 

= E[Y]{k/{k - 2)(exp((fc - 2)h) - 1) + (1 - exp(-^/i))) by Lemma SI 
< E[y](fe/iexp(A; - 2) by Lemma S2. 

As we showed in (1) of Lemma 3A, E\Y] < nkh. Thus, 

E[|di|] = E[|Xi - XiW/h < nk{kexp{k - 2) + fi)h. 

□ 

Lemma 4A. For the SI process, E[\f{x,Ai) — f{z,Ai)\] < Lsi\x — z\ for all x and z, where 
Lsi = k. 

Proof. Using Lemma S7, we may assume that x > z. Let y = f{x,Ai) — f{z,Ai). We will show 
that E[|yj|] < \n{j,x) — n{j, z)\. By Lemma S8, 

E[\f{x,Ai) - f{z,Ai)\] = Y,n\yj\] <^Hj,x) -n{j,z)\ < k\x - z\. 

j j 

Now we show that E[|yj|] < \n{j,x) — n{j,z)\. Note that if Xj = 1, then gj{x,Ai) = 1 a.s. 
and thus fj{x,Ai) = a.s. Hence, if Xj = zj = 1, then the inequality holds because i/j = 
a.s. If Xj = 1 and Zj = 0, then n{j,x) = and yj = gj{z, Ai)/h. Futhermore, E[(7j(z,^i)] = 
1— exp(— z)) < hn{j, z), thus proving the inequality in this case. We now turn to the remaining 
case (remember xj > Zj), where Xj = zj = 0. In that case, \yj\ = \gj{x,Ai) — gj{z, Ai)\/h. Then 
gj{x,Ai) = l(mingg„(-j .J.) ^i(e) < h). Since n{j,x) 5 n{j,z), it follows that gj{x,Ai) > g{z,Ai). 
Hence, yj = if gj{z,Ai) = 1 or gj{x,Ai) = 0. In the remaining case, gj{x,Ai) = 1, gj{z,Ai) = 0, 
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and Uj = 1/h. Thus, 



Fr[yj = 1/h] = 1 - FT[gj{z,Ai) = 1 or gj{x,Ai) = 0] 
= 1 - (Pr[<7,(z,^i) = 1] +Prb,(x, Ai) = 0]) 

since gj{x,Ai) > g{z,Ai) implies that these are exclusive events 

Pr[?/j = 1/h] = 1 - (1 -ex.p{-n{j,z)h) + ex.p{-n{j,x)h)) = ex.p{-n{j, z)h) - exp(-n(j, 

Since n{j,z) < n{j,x), we know from Lemma S3 that Pr[yj = 1/h] < {n{j,x) — n{j,z))h, proving 
the inequality for the case of Xj = Zj = 0. □ 

Lemma 4B. For the SIS process, E[\f{x,Ai) — f{z,Ai)\] < Lsis\x — z\ for all x and z, where 
Lsis = k + 11. 

Proof. We take the same approach as in the proof for Lemma 4 A and use Lemma S7 to assume 
that X > z. Again, we let yj = \ fj[x,Ai) — fj{z,Ai)\. This time will show that IE[yj] < |n(j, a;) — 
n[j, z)\ + ^\xj — Zj\. Then applying Lemma S8 proves our claim: 

E[\f{x,Ai) - f{z,Ai)\] <"^\n{j,x) - n{j, z)\ + fi\xj - Zj\ < {k + fi)\x - z\. 



We now prove the inequality for yj by considering three cases (recall x> z): 

Case 1: Xj = Zj = 0. The reasoning for this case is the same as in Lemma 4A. Now yj = 
\gj{x,Ai) — gj{z, Ai)\/h £ {0,1/h}. Now, yj = if gj{z,Ai) = 1 or gj{x,Ai) = 0. Since these are 
mutually exclusive events (as x > z), 

nyj] = Fv[yj = l/h]/h = (!-(!- exp(-n(j»/i) + exp(-n(i, x)/i)))//i 
= (exp(-n(i,z)/i) - exp{-7i{j,x)h))/h. 

Since n{j,z) < n{j,x), we know from Lemma S3, E[?/j] < \n{j,x) — n{j,z)\. 

Case 2: Xj = Zj = 1. In this case yj = \gj{x,Ai) — gj{z, Ai)\/h = a.s. because the recovery 
time of node j depends only on Ai and not on the state of any other nodes. Thus the inequality 
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holds trivially. 

Case 3: Xj = 1 and Zj = 0. Here, i/j = \gj{x,Ai) — 1 — gj{z,Ai)\/h G {0, 2//i}. Specifically, 
Uj = 1/h when either gj{x,Ai) = and gj{z,Ai) = OR gj{x,Ai) = 1 and gj{z,Ai) = 1. Thus, 

Pr[yj = l//i] = (1 — exp(— exp(— n(j, 2;)/i) + exp(— — exp(— n(j, 

Similarly, yj = 2/h only if g{x, Ai) = and g{z, Ai) = 1. Thus Pr[?/j = 2//i] = (1 — exp(— — 
exp(— Thus, 

E[y,-] = Pr[y, = 2//i](2//i) + Pr^, = l/h]{l/h) 

= (2 — exp(— ^/i) — exp(— z)h))/h < fi + n{j, z) 
< \n{j,x) - n{j,z)\ + n\xj - Zj\, 

since zj = 0, Xj = 1, and n{j,x) = by definition. □ 

Lemma 5. Euler's method is 0-stable, that is for any two sequences of random variables (Yi) and 
(Zi) adapted to {Fi) with Yq = Zq, 

E[\Y, - Zi\] < K maxE[\D{Y,,Y,_i,Aj) - D{Zj, Z,_i, Aj)\], 
i<j<« 

where K = (l/L)(exp(LT) — 1), T = ih, and L is the Lipschitz constant from Lemma 4 (i-^-, 
Lemma 4A or Lemma 4B depending on the process). Depending on the process we may write K as 
Ksi or Ksis- 

Proof. As mentioned in the main body of the paper, the proof follows [16, p41]. We define Si = 
Yi — Zi and 9 = maxi<j< j E[| Z)(y^', y^_i, ylj) — D{Zj,Zj^i,Aj)\]. Thus for all j, 

9 > E[\D{Yj,Yj^i,Aj) - D{Zj, Zj^i,Aj)\] by definition, 
= K[\Sj/h — Sj^i/h — {f{Yj^i,Aj) — f{Zj^i, Aj))\] using the definitions of D and 5, 
> mSj\]/h - E[\Sj-i\]/h - E[\f{Yj_i,Aj) - f{Zj^i,Aj)\] using the triangle inequality. 

Since Aj is distributed like Ai and is independent of i^-i and Zj^i due to the filtration, we 
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can apply Lemma 4 to obtain, 9 > E[\Sj\]/h - E[\Sj-i\]/h - LE{\Sj-i\]. Thus, E[\Sj\] < Oh + 
E[|S'j_i|](l + hL). Hence by induction, 

i 

MSiW < Oh^{l + hL)'-^ since So = 0, 
< e{ex.p{Lhi) - l)/L. 

□ 



25 



